feat: support structured outputs (response_format) in chat completions #43
Conversation
Wire the OpenAI-compatible `response_format` parameter through the chat completion pipeline:

- Bind `response_format` to the LangChain model via `model.bind()` for `json_object` and `json_schema` types (`text` is a no-op)
- Apply to both streaming and non-streaming code paths
- Include `response_format` in the canonical request dict so TEE hashing covers the requested output format
- Add 14 unit tests covering parsing, hash-dict serialization, model binding, and interaction with tool calling

Closes OpenGradient#14
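The hashing point above can be sketched in plain Python. This is an illustrative reconstruction, not the gateway's actual `_chat_request_to_dict` (field names and the helper names here are assumptions): including `response_format` in the canonical dict means two otherwise-identical requests that differ only in requested output format hash differently.

```python
import hashlib
import json

def chat_request_to_hash_dict(request: dict) -> dict:
    # Illustrative canonical dict; the real implementation covers more
    # request fields than shown here.
    canonical = {
        "model": request.get("model"),
        "messages": request.get("messages"),
    }
    # Only include response_format when the client actually sent one.
    if request.get("response_format") is not None:
        canonical["response_format"] = request["response_format"]
    return canonical

def tee_hash(request: dict) -> str:
    # sort_keys + fixed separators make the serialization deterministic,
    # so the TEE hash is reproducible for identical requests.
    payload = json.dumps(chat_request_to_hash_dict(request),
                         sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()
```
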
Pull request overview
This PR adds OpenAI-compatible structured outputs support to the chat completions pipeline by threading the response_format request parameter through to LangChain model invocation and including it in the TEE request hash.
Changes:

- Bind `response_format` onto the LangChain chat model (non-streaming + streaming) via `model.bind(response_format=...)` for non-`text` formats.
- Include `response_format` in the canonical `_chat_request_to_dict(...)` used for deterministic TEE hashing/signing.
- Add a new unit test module covering parsing, hashing inclusion/determinism, and non-streaming model binding behavior.
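The bind-ordering described above (tools first, then `response_format`) can be demonstrated with a minimal stub in place of a real LangChain model. `StubModel` and `prepare_model` are hypothetical names for this sketch; the stub only mimics the way `Runnable.bind()` returns a new object carrying accumulated kwargs.

```python
class StubModel:
    """Minimal stand-in for a LangChain chat model: like Runnable.bind(),
    each bind() returns a new model carrying the accumulated kwargs."""
    def __init__(self, bound=None):
        self.bound = dict(bound or {})

    def bind(self, **kwargs):
        merged = dict(self.bound)
        merged.update(kwargs)
        return StubModel(merged)

def prepare_model(model, tools=None, response_format=None):
    # Tool binding happens first, matching the order described in the PR.
    if tools:
        model = model.bind(tools=tools)
    # 'text' (or an absent response_format) is a no-op; only json_object
    # and json_schema are bound onto the model.
    if response_format and response_format.get("type") in ("json_object", "json_schema"):
        model = model.bind(response_format=response_format)
    return model
```
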
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| `tee_gateway/controllers/chat_controller.py` | Wires `response_format` into both non-streaming and streaming invocation flows and into the canonical request hash dict. |
| `tests/test_structured_outputs.py` | Adds unit tests for request parsing, hash dict inclusion, and non-streaming bind behavior (including tool binding interaction). |
Hello! @giwaov I tested this branch and ran into two issues:

It turns out Anthropic doesn't support `response_format` through LangChain anyway. I also added streaming tests and fixed `gpt-4o` → `gpt-4.1` in the existing tests. The fix is here: 9b5278b — you can pull it in with a cherry-pick. Please pull it and push onto this PR. Alternatively, I'll just open a PR from my commit and merge that way. Thanks again for your contribution!
…ng tests

- Route Anthropic `json_schema` requests through `with_structured_output()` instead of `bind()`, which Anthropic does not support for `response_format`. Raise a clear error for `json_object` (no Anthropic native equivalent).
- Inject the `json_schema` wrapper `name` as `title` in the schema dict so langchain-anthropic can derive a function name for its tool-use mechanism.
- Handle Anthropic structured output in the streaming path by invoking synchronously and emitting the result as a single SSE content chunk.
- Fix OpenAPI spec: remove `type` from `required` in `ResponseFormatJsonSchema` so `json_schema` requests pass connexion validation.
- Fix pre-existing test breakage: `gpt-4o` → `gpt-4.1` (model removed from registry).
- Add streaming tests: binding behaviour for all providers, Anthropic SSE chunk output, and TEE hash content correctness.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
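The routing decision described in this commit can be sketched as a small dispatch. This is a hypothetical reconstruction (the `provider` argument, function name, and stub model are assumptions, not the controller's actual signature); it shows the three paths: Anthropic `json_schema` via `with_structured_output()` with the injected `title`, Anthropic `json_object` rejected, and everything else via `bind()`.

```python
class StubModel:
    """Stand-in recording which LangChain-style call was made."""
    def bind(self, **kwargs):
        return ("bind", kwargs)

    def with_structured_output(self, schema):
        return ("structured", schema)

def apply_response_format(model, provider, response_format):
    fmt = response_format.get("type", "text")
    if fmt == "text":
        return model  # default: nothing to do
    if provider == "anthropic":
        if fmt == "json_object":
            # No Anthropic-native equivalent of OpenAI's JSON mode.
            raise ValueError("json_object is not supported for Anthropic models")
        schema = dict(response_format["json_schema"]["schema"])
        # Inject the wrapper 'name' as 'title' so langchain-anthropic can
        # derive a function name for its tool-use mechanism.
        schema.setdefault("title", response_format["json_schema"]["name"])
        return model.with_structured_output(schema)
    # Other providers accept response_format directly via bind().
    return model.bind(response_format=response_format)
```
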
Hey @kylexqian, thanks for testing and catching those issues! Cherry-picked your fix (9b5278b); all 24 tests passing locally, ruff format and lint clean. Pushed.
Hello again @giwaov, thanks for the quick response! It seems a lint error got through; I also addressed one of the Copilot comments. Do you mind cherry-picking these two commits and adding them to your PR? They should hopefully be the last ones!
kylexqian left a comment:
Added the cherry-pick commits that fix a couple of bugs and add tests. Other than that, looks good!
Add `_normalize_response_format()` helper that coerces `response_format` to a plain dict regardless of whether it arrives as a dict, Pydantic model, or other object. Apply it in both streaming/non-streaming binding paths and in the TEE hash dict, preventing silent `json_schema` payload loss when `rf_dict` was reconstructed as only `{"type": ...}`, and preventing a potential `json.dumps` failure in `_chat_request_to_dict` on non-dict input.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Glad this is coming together! This has been a fun one to work on. With this and the other merged PRs, I've been spending a lot of time in the codebase and would love to get the Alpha OG role on Discord if that's something you can help with. Happy to keep contributing.
Summary
Implements OpenAI-compatible structured outputs support by wiring the `response_format` parameter through the chat completion pipeline, as requested in #14.

Changes
- `tee_gateway/controllers/chat_controller.py` (`_create_non_streaming_response`): After tool binding, checks `response_format`. If the type is `json_object` or `json_schema`, binds it to the LangChain model via `model.bind(response_format=...)`. The `text` type is a no-op (default behavior).
- (`_create_streaming_response`): Identical logic applied after tool binding.
- (`_chat_request_to_dict`): Includes `response_format` in the canonical serialized dict so the TEE signature covers the requested output format.
- `tests/test_structured_outputs.py`: 14 unit tests covering:
  - Parsing `response_format` from request dicts (`text`, `json_object`, `json_schema`, and absent)
  - Hash-dict inclusion and determinism
  - Model binding (verifying `bind_tools` and `bind(response_format=...)` chain correctly)

Design Decisions
- `llm_backend.py`: The `response_format` is bound per-request via `model.bind()` after retrieving the cached model, following the same pattern already used for tool binding. This keeps the LRU cache clean (keyed only on model/temperature/max_tokens).
- The `response_format` dict is forwarded as-is to LangChain, which handles provider-specific translation. This maintains OpenAI API compatibility and works with all supported providers (OpenAI, Anthropic, Google, xAI).

Supported Formats
Per the OpenAPI spec already defined in the repo:
- `{type: text}`: plain text (default, no-op)
- `{type: json_object}`: JSON mode
- `{type: json_schema, json_schema: {name: ..., schema: {...}, strict: true}}`: strict schema-constrained output

Closes #14
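For concreteness, here are hypothetical example payloads for the three supported shapes (the schema name and fields are illustrative only), plus a small checker mirroring the validation that the OpenAPI spec performs:

```python
TEXT_FMT = {"type": "text"}
JSON_MODE_FMT = {"type": "json_object"}
JSON_SCHEMA_FMT = {
    "type": "json_schema",
    "json_schema": {
        "name": "weather_report",      # illustrative name
        "strict": True,
        "schema": {
            "type": "object",
            "properties": {
                "city": {"type": "string"},
                "temp_c": {"type": "number"},
            },
            "required": ["city", "temp_c"],
            "additionalProperties": False,
        },
    },
}

def is_supported_response_format(rf: dict) -> bool:
    # Mirrors the three shapes above; a json_schema request must carry
    # its json_schema wrapper alongside the type discriminator.
    t = rf.get("type")
    if t in ("text", "json_object"):
        return True
    return t == "json_schema" and "json_schema" in rf
```
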